Abstract

The primary theme of this investigation is a deci- 
sion theoretic account of conditional ought statements 
(e.g., “You ought to do A, if C”) that rectifies glaring
deficiencies in classical deontic logic. The resulting ac- 
count forms a sound basis for qualitative decision the- 
ory, thus providing a framework for qualitative plan- 
ning under uncertainty. In particular, we show that 
adding causal relationships (in the form of a single 
graph) as part of an epistemic state is sufficient to 
facilitate the analysis of action sequences, their conse- 
quences, their interaction with observations, their ex- 
pected utilities and, hence, the synthesis of plans and 
strategies under uncertainty. 


1 INTRODUCTION 

In natural discourse, “ought” statements reflect two 
kinds of considerations: requirements to act in ac- 
cordance with moral convictions or peer’s expecta- 
tions, and requirements to act in the interest of one’s 
survival, namely, to avoid danger and pursue safety. 
Statements of the second variety are natural candi- 
dates for decision theoretic analysis, albeit qualitative 
in nature, and these will be the focus of our discus- 
sion. The idea is simple. A sentence of the form 
“You ought to do A if C" is interpreted as shorthand 
for a more elaborate sentence: “If you observe, believe,
or know C, then the expected utility resulting
from doing A is much higher than that resulting from 
not doing A”.¹ The longer sentence combines several
modalities that have been the subjects of AI investigations:
observation, belief, knowledge, probability (“expected”),
desirability (“utility”), causation (“resulting
from”), and, of course, action (“doing A”). With the
exception of utility, these modalities have been for- 
mulated recently using qualitative, order-of-magnitude 
abstractions of probability theory (Goldszmidt & Pearl
1992, Goldszmidt 1992). Utility preferences them- 


1 An alternative interpretation, in which “doing A” is
required to be substantially superior to both “not doing A”
and “doing not-A”, is equally valid, and could be formulated
as a straightforward extension of our analysis.


selves, we know from decision theory, can be fairly 
unstructured, save for obeying asymmetry and transitivity.
Thus, paralleling the order-of-magnitude abstraction
of probabilities, it is reasonable to score consequences
on an integer scale of utility: very desirable
($U = +O(1/\epsilon)$), very undesirable ($U = -O(1/\epsilon)$),
bearable ($U = O(1)$), and so on, mapping each linguistic
assessment into the appropriate $\pm O(1/\epsilon^i)$ utility rating.
This utility rating, when combined with the infinitesimal
rating of probabilistic beliefs (Goldszmidt
& Pearl 1992), should permit us to rate actions by the
expected utility of their consequences, and a requirement
to do A would then be asserted iff the rating of
doing A is substantially (i.e., a factor of $1/\epsilon$) higher
than that of not doing A.


This decision theoretic agenda, although conceptually 
straightforward, encounters some subtle difficulties in 
practice. First, when we deal with actions and conse- 
quences, we must resort to causal knowledge of the do- 
main and we must decide how such knowledge is to be 
encoded, organized, and utilized. Second, while theo- 
ries of actions are normally formulated as theories of 
temporal changes (Shoham 1988, Dean & Kanazawa
1989), ought statements invariably suppress explicit 
references to time, strongly suggesting that temporal 
information is redundant, namely, it can be recon- 
structed if required, but glossed over otherwise. In 
other words, the fact that people comprehend, evalu- 
ate and follow non-temporal ought statements suggests 
that people adhere to some canonical, yet implicit as- 
sumptions about temporal progression of events, and 
that no account can be complete without making these 
assumptions explicit. Third, actions in decision the- 
ory are predesignated explicitly to a few distinguished 
atomic variables, while statements of the type “You
ought to do A” are presumed applicable to any arbitrary
proposition A.² Finally, decision theoretic methods,
especially those based on static influence diagrams,
treat both the informational relationships between
observations and actions and the causal relationships
between actions and consequences as instantaneous
(Chapter 6, Pearl 1988, Shachter 1986). In reality,
the effect of our next action might be to invalidate
currently observed properties, hence any non-temporal
account of ought must carefully distinguish properties
that are influenced by the action from those that will
persist despite the action, and must therefore explicate
some canonical assumptions about persistence.

2 This has been an overriding assumption in both the
deontic logic and the preference logic literatures.

These issues are the primary focus of this paper. We
start by presenting a brief introduction to infinitesimal
probabilities and showing how actions, beliefs,
and causal relationships are represented by ranking
functions $\kappa(\omega)$ and causal networks $\Gamma$ (Section 2). In
Section 3 we present a summary of the formal results
obtained in this paper, including an assertability criterion
for conditional oughts. Sections 4 and 5 explicate
the assumptions leading to the criterion presented in
Section 3. In Section 4 we introduce an integer-valued
utility ranking $\mu(\omega)$ and show how the three components,
$\kappa(\omega)$, $\Gamma$, and $\mu(\omega)$, permit us to calculate,
semi-qualitatively, the utility of an arbitrary proposition $\varphi$,
the utility of a given action A, and whether we ought to
do A. Section 5 introduces conditional oughts, namely,
statements in which the action is contingent upon observing
a condition C. A calculus is then developed
for transforming the conditional ranking $\kappa(\omega \mid C)$ into
a new ranking $\kappa_A(\omega \mid C)$, representing the beliefs an
agent will possess after implementing action A, having
observed C. These two ranking functions are then
combined with $\mu(\omega)$ to form an assertability criterion
for the conditional statement $O(A \mid C)$: “We ought to
do A, given C”. In Section 6 we compare our formulation
to other accounts of ought statements, in particular
deontic logic, preference logic, counterfactual
conditionals, and quantitative decision theory.

2 INFINITESIMAL PROBABILITIES, RANKING
FUNCTIONS, CAUSAL NETWORKS, AND ACTIONS

1. (Ranking Functions). Let $\Omega$ be a set of worlds,
each world $\omega \in \Omega$ being a truth-value assignment to
a finite set of atomic variables $(X_1, X_2, \ldots, X_n)$, which
in this paper we assume to be bi-valued, namely,
$X_i \in \{true, false\}$. A belief ranking function $\kappa(\omega)$
is an assignment of non-negative integers to the elements
of $\Omega$ such that $\kappa(\omega) = 0$ for at least one $\omega \in \Omega$.
Intuitively, $\kappa(\omega)$ represents the degree of surprise associated
with finding a world $\omega$ realized, and worlds assigned
$\kappa = 0$ are considered serious possibilities. $\kappa(\omega)$
can be considered an order-of-magnitude approximation
of a probability function $P(\omega)$ by writing $P(\omega)$ as
a polynomial of some small quantity $\epsilon$ and taking the
most significant term of that polynomial, i.e.,

$$P(\omega) \cong C\,\epsilon^{\kappa(\omega)} \tag{1}$$

Treating $\epsilon$ as an infinitesimal quantity induces a conditional
ranking function $\kappa(\varphi \mid \psi)$ on propositions which
is governed by Spohn’s calculus (Spohn 1988):

$$\kappa(\Omega) = 0$$

$$\kappa(\varphi) = \begin{cases} \min_{\omega} \kappa(\omega) & \text{for } \omega \models \varphi \\ \infty & \text{for } \omega \models \neg\varphi \end{cases}$$

$$\kappa(\varphi \mid \psi) = \kappa(\varphi \wedge \psi) - \kappa(\psi) \tag{2}$$
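As an illustration of Eq. (2), the following minimal Python sketch computes the rank of a proposition and a conditional rank by brute-force enumeration of worlds. The helper names (`worlds`, `rank`, `cond_rank`) and the two-variable ranking are assumptions made for this example; they are not part of the formal development.

```python
import math
from itertools import product

INF = math.inf

def worlds(variables):
    """Enumerate all truth assignments to the given atomic variables."""
    return [dict(zip(variables, values))
            for values in product([True, False], repeat=len(variables))]

def rank(kappa, all_worlds, phi):
    """kappa(phi): minimum rank over the worlds satisfying phi (infinity if none)."""
    return min((kappa(w) for w in all_worlds if phi(w)), default=INF)

def cond_rank(kappa, all_worlds, phi, psi):
    """kappa(phi | psi) = kappa(phi AND psi) - kappa(psi), as in Eq. (2)."""
    k_psi = rank(kappa, all_worlds, psi)
    if k_psi == INF:
        raise ValueError("cannot condition on an impossible proposition")
    return rank(kappa, all_worlds, lambda w: phi(w) and psi(w)) - k_psi

# Hypothetical ranking over c ("cloudy") and r ("rain"):
# rain on a clear day carries one unit of surprise.
W = worlds(["c", "r"])

def kappa(w):
    return 1 if (w["r"] and not w["c"]) else 0

print(rank(kappa, W, lambda w: w["r"]))                              # 0
print(cond_rank(kappa, W, lambda w: w["r"], lambda w: not w["c"]))   # 1
```

Rain by itself is a serious possibility (rank 0), while rain given a clear sky is surprising to degree 1.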

2. (Stratified Rankings and Causal Networks (Goldszmidt
& Pearl 1992)). A causal network is a directed
acyclic graph (dag) in which each node corresponds to
an atomic variable and each edge $X_i \to X_j$ asserts
that $X_i$ has a direct causal influence on $X_j$. Such networks
provide a convenient data structure for encoding
two types of information: how the initial ranking function
$\kappa(\omega)$ is formed, and how external actions would
influence the agent’s belief ranking $\kappa(\omega)$. Formally,
causal networks are defined in terms of two notions:
stratification and actions.

A ranking function $\kappa(\omega)$ is said to be stratified relative
to a dag $\Gamma$ if

$$\kappa(\omega) = \sum_i \kappa(X_i(\omega) \mid pa_i(\omega)) \tag{3}$$

where $pa_i(\omega)$ are the parents of $X_i$ in $\Gamma$ evaluated at
state $\omega$. Given a ranking function $\kappa(\omega)$, any
edge-minimal dag $\Gamma$ satisfying Eq. (3) is called a Bayesian
network of $\kappa(\omega)$ (Pearl 1988). A dag $\Gamma$ is said to be a
causal network of $\kappa(\omega)$ if it is a Bayesian network of
$\kappa(\omega)$ and, in addition, it admits the following representation
of actions.

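3. (Actions). The effect of an atomic action $do(X_i = true)$
is represented by adding to $\Gamma$ a link $DO_i \to X_i$,
where $DO_i$ is a new variable taking values in
$\{do(x_i), do(\neg x_i), idle\}$ and $x_i$ stands for $X_i = true$.
Thus, the new parent set of $X_i$ is $pa_i' = pa_i \cup \{DO_i\}$
and it is related to $X_i$ by

$$\kappa(X_i(\omega) \mid pa_i'(\omega)) = \begin{cases} \kappa(X_i(\omega) \mid pa_i(\omega)) & \text{if } DO_i = idle \\ \infty & \text{if } DO_i = do(y) \text{ and } X_i(\omega) \neq y \\ 0 & \text{if } DO_i = do(y) \text{ and } X_i(\omega) = y \end{cases} \tag{4}$$

The effect of performing action $do(x_i)$ is to transform
$\kappa(\omega)$ into a new belief ranking, $\kappa_{x_i}(\omega)$, given by

$$\kappa_{x_i}(\omega) = \begin{cases} \kappa'(\omega \mid do(x_i)) & \text{for } \omega \models x_i \\ \infty & \text{for } \omega \models \neg x_i \end{cases} \tag{5}$$

where $\kappa'$ is the ranking dictated by the augmented
network $\Gamma \cup \{DO_i \to X_i\}$ and Eqs. (3) and (4).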

This representation embodies the following aspects of 
actions: 


(i) An action $do(x_i)$ can affect only the descendants
of $X_i$ in $\Gamma$.


(ii) Fixing the value of $pa_i$ (by some appropriate
choice of actions) renders $X_i$ unaffected by any
external intervention $do(x_k)$, $k \neq i$.
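The contrast between observing and doing can be sketched in a few lines of Python. The chain network c → r → w ("cloudy", "rain", "wet ground") and its local rank table are hypothetical and chosen only for illustration. The function `kappa` sums local conditional ranks as in Eq. (3), and a variable fixed through `do` contributes 0 or infinity as in Eq. (4), which is one plausible reading of the transformation in Eq. (5).

```python
import math
from itertools import product

INF = math.inf

# Hypothetical chain network: c -> r -> w.
PARENTS = {"c": [], "r": ["c"], "w": ["r"]}

def local_rank(var, value, pa):
    """kappa(X = value | parents); an assumed, illustrative table."""
    if var == "c":
        return 0                                     # cloudy or clear: both unsurprising
    if var == "r":
        return 1 if (value and not pa["c"]) else 0   # rain on a clear day is surprising
    if var == "w":
        return 1 if value != pa["r"] else 0          # ground is wet iff it rained, by default
    raise KeyError(var)

def kappa(world, do=None):
    """Stratified rank of a world (Eq. 3); variables fixed by `do` follow Eq. (4)."""
    do = do or {}
    total = 0
    for var, pa_names in PARENTS.items():
        if var in do:
            if world[var] != do[var]:
                return INF        # world contradicts the enforced value
            continue              # enforced value contributes rank 0
        pa = {p: world[p] for p in pa_names}
        total += local_rank(var, world[var], pa)
    return total

def rank(phi, do=None):
    """Rank of a proposition: minimum rank over the worlds satisfying it."""
    all_worlds = [dict(zip(PARENTS, vals))
                  for vals in product([True, False], repeat=len(PARENTS))]
    return min((kappa(w, do) for w in all_worlds if phi(w)), default=INF)

# Observing wet ground makes "no rain" surprising ...
observed = rank(lambda w: not w["r"] and w["w"]) - rank(lambda w: w["w"])
# ... while wetting the ground ourselves tells us nothing about rain.
enacted = rank(lambda w: not w["r"], do={"w": True})
print(observed, enacted)   # 1 0
```

The second result reflects property (i): the action on w leaves the rank of its non-descendant r unchanged.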
3 SUMMARY OF RESULTS


The assertability condition we are about to develop
in this paper requires the specification of an epistemic
state $ES = (\kappa(\omega), \Gamma, \mu(\omega))$ which consists of three
components:


$\kappa(\omega)$ - an ordinal belief ranking function on $\Omega$.

$\Gamma$ - a causal network of $\kappa(\omega)$.

$\mu(\omega)$ - an integer-valued utility ranking of worlds,
where $\mu(\omega) = \pm i$ assigns to $\omega$ a utility
$U(\omega) = \pm O(1/\epsilon^i)$, $i = 0, 1, 2, \ldots$


The main results of this paper can be summarized as 
follows: 

1. Let $W_i^+$ and $W_i^-$ be the formulas whose models
receive utility ranking $+i$ and $-i$, respectively, and let
$\kappa'(\omega)$ denote the ranking function that prevails after
establishing the truth of event $\varphi$, where $\varphi$ is an arbitrary
proposition (i.e., $\kappa'(\neg\varphi) = \infty$ and $\kappa'(\varphi) = 0$).
The expected utility rank of $\varphi$ is characterized by two
integers

$$n^+ = \max_i\,[0;\ i - \kappa'(W_i^+ \wedge \varphi)]$$
$$n^- = \max_i\,[0;\ i - \kappa'(W_i^- \wedge \varphi)] \tag{6}$$

and is given by

$$\mu[\varphi; \kappa'(\omega)] = \begin{cases} \text{ambiguous} & \text{if } n^+ = n^- > 0 \\ n^+ - n^- & \text{otherwise} \end{cases} \tag{7}$$

2. A conditional ought statement $O(A \mid C)$ is assertable
in ES iff

$$\mu(A; \kappa_A(\omega \mid C)) > \mu(true; \kappa(\omega \mid C)) \tag{8}$$

where A and C are arbitrary propositions and the
ranking $\kappa_A(\omega \mid C)$ (to be defined in step 3) represents
the beliefs that an agent anticipates holding, after implementing
action A, having observed C.


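3. If A is a conjunction of atomic propositions,
$A = \bigwedge_{j \in J} A_j$, where each $A_j$ stands for either $X_j = true$
or $X_j = false$, then the post-action ranking $\kappa_A(\omega \mid C)$
is given by the formula

$$\kappa_A(\omega \mid C) = \kappa(\omega) - \sum_{i \in J \cup R} \kappa(X_i(\omega) \mid pa_i(\omega)) + \min_{\omega'} \Big[ \sum_{i \notin J} S_i(\omega, \omega') + \kappa(\omega' \mid C) \Big] \tag{9}$$

where R is the set of root nodes and

$$S_i(\omega, \omega') = \begin{cases} s_i & \text{if } X_i(\omega) \neq X_i(\omega') \text{ and } pa_i = \emptyset \\ s_i & \text{if } X_i(\omega) \neq X_i(\omega'),\ pa_i \neq \emptyset \text{ and } \kappa(\neg X_i(\omega) \mid pa_i(\omega)) = 0 \\ 0 & \text{otherwise} \end{cases} \tag{10}$$

$S_i(\omega, \omega')$ represents persistence assumptions: It is surprising
(to degree $s_i \geq 1$) to find $X_i$ change from its
pre-action value of $X_i(\omega')$ to a post-action value of
$X_i(\omega)$ if there is no causal reason for the change.


If A is a disjunction of actions, $A = \bigvee_l A^l$, where each
$A^l$ is a conjunction of atomic propositions, then

$$\kappa_A(\omega \mid C) = \min_l \kappa_{A^l}(\omega \mid C) \tag{11}$$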
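The expected utility rank of Eqs. (6) and (7) can be sketched as a short Python function. The function name `utility_rank`, the `AMBIGUOUS` marker, and the three named worlds are assumptions made for this illustration; the ranking passed in is taken to be the post-establishment ranking, normalized so that some world satisfying the proposition has rank 0.

```python
import math

INF = math.inf
AMBIGUOUS = "ambiguous"

def utility_rank(worlds, kappa_prime, mu, phi):
    """Utility rank of phi under kappa', following Eqs. (6)-(7).

    worlds      : iterable of worlds
    kappa_prime : world -> belief rank after establishing phi
    mu          : world -> integer utility rank (+i, 0, or -i)
    phi         : world -> bool
    """
    n_plus = n_minus = 0
    for w in worlds:
        if not phi(w) or kappa_prime(w) == INF:
            continue
        i = mu(w)
        if i > 0:
            # maximizing i - kappa'(w) over worlds equals max_i [i - kappa'(W_i^+ and phi)]
            n_plus = max(n_plus, i - kappa_prime(w))
        elif i < 0:
            n_minus = max(n_minus, -i - kappa_prime(w))
    if n_plus == n_minus and n_plus > 0:
        return AMBIGUOUS
    return n_plus - n_minus

# Hypothetical worlds: name -> (utility rank mu, belief rank kappa').
table = {"jackpot": (+2, 1), "loss": (-1, 0), "status quo": (0, 0)}
result = utility_rank(table, lambda w: table[w][1], lambda w: table[w][0], lambda w: True)
print(result)   # ambiguous
```

Here a somewhat surprising but very large gain (n+ = 2 - 1 = 1) exactly offsets an unsurprising moderate loss (n- = 1 - 0 = 1), so the proposition is rated ambiguous.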


4 FROM UTILITIES AND BELIEFS 
TO GOALS AND ACTIONS 


Given a proposition $\varphi$ that describes some condition or
event in the world, what information is needed before
we can evaluate the merit of obtaining $\varphi$, or, at the
least, whether $\varphi_1$ is “preferred” to $\varphi_2$? Clearly, if we
are to apply the expected utility criterion, we should
define two measures on possible worlds, a probability
measure $P(\omega)$ and a utility measure $U(\omega)$. The
first rates the likelihood that a world $\omega$ will be realized,
while the second measures the desirability of $\omega$.
Unfortunately, probabilities and utilities in themselves
are not sufficient for determining preferences among
propositions. The merit of obtaining $\varphi$ depends on at
least two other factors: how the truth of $\varphi$ is established,
and what control we possess over which model
of $\varphi$ will eventually prevail. We will demonstrate these
two factors by example.

Consider the proposition $\varphi$ = “The ground is wet”. In
the midst of a drought, the consequences of this statement
would depend critically on whether we watered
the ground (action) or we happened to find the ground
wet (observation). Thus, the utility of a proposition $\varphi$
clearly depends on how we came to know $\varphi$, by mere
observation or by willful action. In the first case, finding
$\varphi$ true may provide information about the natural
process that led to the observation $\varphi$, and we should
change the current probability from $P(\omega)$ to $P(\omega \mid \varphi)$.
In the second case, our actions may perturb the natural
flow of events, and $P(\omega)$ will change without shedding
light on the typical causes of $\varphi$. We will denote
the probability resulting from externally enforcing the
truth of $\varphi$ by $P_\varphi(\omega)$, which will be further explicated
in Section 5 in terms of the causal network $\Gamma$.³

However, regardless of whether the probability function
$P(\omega \mid \varphi)$ or $P_\varphi(\omega)$ results from learning $\varphi$, we are
still unable to evaluate the merit of $\varphi$ unless we understand
what control we have over the opportunities
offered by $\varphi$. Simply taking the expected utility
$U(\varphi) = \sum_\omega [P(\omega \mid \varphi)\, U(\omega)]$ amounts to assuming that
the agent is to remain totally passive until Nature selects
a world $\omega$ with probability $P(\omega \mid \varphi)$, as in a game of
chance. It ignores subsequent actions which the agent
might be able to take so as to change this probability.
For example, event $\varphi$ might provide the agent with the
option of conducting further tests so as to determine
with greater certainty which world would eventually
be realized. Likewise, in case $\varphi$ stands for “Joe went
to get his gun”, our agent might possess the wisdom
to protect itself by escaping in the next taxicab.


3 The difference between $P(\omega \mid \varphi)$ and $P_\varphi(\omega)$ is precisely
the difference between conditioning and “imaging” (Lewis
1973), and between belief revision and belief update
(Alchourron et al. 1985, Katsuno & Mendelzon 1991,
Goldszmidt & Pearl 1992). It also accounts for the difference
between indicative and subjunctive conditionals, a topic
of much philosophical discussion (Harper et al. 1980).
In practical decision analysis the utility of being in a
situation $\varphi$ is computed using a dynamic programming
approach, which assumes that subsequent to realizing
$\varphi$ the agent will select the optimal sequence of actions
from those enabled by $\varphi$. This computation is rather
exhaustive and is often governed by some form of myopic
approximation (Chapter 6, Pearl 1988). Ought
statements normally refer to a single action A, tacitly
assuming that the choice of subsequent actions, if
available, is rather obvious and their consequences are
well understood. We say, for example, “You ought to
get some food”, assuming that the food would subsequently
be eaten and not be left to rot in the car. In
our analysis, we will make a similar myopic approximation,
assuming either that action A is terminal or
that the consequences of subsequent actions (if available)
are already embodied in the functions $P(\omega)$ and
$\mu(\omega)$. We should keep in mind, however, that the result
of this myopic approximation is not applicable to
all actions; in sequential planning situations, some actions
may be selected for the sole purpose of enabling
certain subsequent actions.


Denote by $P'(\omega)$ the probability function that would
prevail after obtaining $\varphi$.⁴ Let us examine next how
the expected utility criterion $U(\varphi) = \sum_\omega P'(\omega)\, U(\omega)$
translates into the language of ranking functions.

Let us assume that U takes on values in
$\{-O(1/\epsilon),\ O(1),\ +O(1/\epsilon)\}$, read as {very undesirable,
bearable, very desirable}. For notational simplicity,
we can describe these linguistic labels as a utility
ranking function $\mu(\omega)$ that takes on the values
$-1$, $0$, and $+1$, respectively. Our task, then, is to
evaluate the rank $\mu(\varphi)$, as dictated by the expected
value of $U(\omega)$ over the models of $\varphi$.


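Let the sets of worlds assigned the ranks $-1$, $0$, and $+1$
be represented by the formulas $W^-$, $W^0$, and $W^+$,
respectively, and let the intersections of these sets with
$\varphi$ be represented by the formulas $\varphi^-$, $\varphi^0$, and $\varphi^+$,
respectively. The expected utility of $\varphi$ is given by
$-C_-/\epsilon\, P'(W^-) + C_0\, P'(W^0) + C_+/\epsilon\, P'(W^+)$,
where $C_-$, $C_0$, and $C_+$ are some positive coefficients.
Introducing now the infinitesimal approximation for
$P'$, in the form of the ranking function $\kappa'$, we obtain

$$U(\varphi) = \begin{cases} -O(1/\epsilon) & \text{if } \kappa'(\varphi^-) = 0 \text{ and } \kappa'(\varphi^+) > 0 \\ O(1) & \text{if } \kappa'(\varphi^-) > 0 \text{ and } \kappa'(\varphi^+) > 0 \\ +O(1/\epsilon) & \text{if } \kappa'(\varphi^-) > 0 \text{ and } \kappa'(\varphi^+) = 0 \\ \text{ambiguous} & \text{if } \kappa'(\varphi^-) = \kappa'(\varphi^+) = 0 \end{cases} \tag{12}$$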


4 $P'(\omega) = P(\omega \mid \varphi)$ in case $\varphi$ is observed, and
$P'(\omega) = P_\varphi(\omega)$ in case $\varphi$ is enacted. In both cases $P'(\varphi) = 1$.

The ambiguous status reflects a state of conflict
$U(\varphi) = -C_-/\epsilon + C_+/\epsilon$, where there is a serious possibility
of ending in either terrible disaster or enormous
success. Recognizing that ought statements are often
intended to avert such situations (e.g., “You ought
to invest in something safer”), we may take a risk-averse
attitude and rank ambiguous states as low as
$U = -O(1/\epsilon)$ (other attitudes are, of course, perfectly
legitimate). This attitude, together with $\kappa'(\varphi) = 0$,
yields the desired expression for $\mu(\varphi; \kappa'(\omega))$:

$$\mu(\varphi; \kappa'(\omega)) = \begin{cases} -1 & \text{if } \kappa'(W^- \mid \varphi) = 0 \\ 0 & \text{if } \kappa'(W^- \vee W^+ \mid \varphi) > 0 \\ +1 & \text{if } \kappa'(W^- \mid \varphi) > 0 \text{ and } \kappa'(W^+ \mid \varphi) = 0 \end{cases} \tag{13}$$

The three-level utility model is, of course, only a
coarse rating of desirability. In a multi-level model,
where $W_i^+$ and $W_i^-$ are the formulas whose models
receive utility ranking $+i$ and $-i$, respectively,⁵ and
$i = 0, 1, 2, \ldots$, the ranking of the expected utility of
$\varphi$ is given by Eq. (7) (Section 3).

Having derived a formula for the utility rank of an
arbitrary proposition $\varphi$, we are now in a position to
formulate our interpretation of the deontic expression
$O(A \mid C)$: “You ought to do A if C, iff the expected
utility associated with doing A is much higher than
that associated with not doing A”. We start with a
belief ranking $\kappa(\omega)$ and a utility ranking $\mu(\omega)$, and
we wish to assess the utilities associated with doing
A versus not doing A, given that we observe C. The
observation C would transform our current $\kappa(\omega)$ into
$\kappa(\omega \mid C)$. Doing A would further transform $\kappa(\omega \mid C)$
into $\kappa'(\omega) = \kappa_A(\omega \mid C)$, while not doing A would render
$\kappa(\omega \mid C)$ unaltered, so $\kappa'(\omega) = \kappa(\omega \mid C)$. Thus,
the utility rank associated with doing A is given by
$\mu(A; \kappa_A(\omega \mid C))$, while that associated with not doing
A is given by $\mu(C; \kappa(\omega \mid C)) = \mu(true; \kappa(\omega \mid C))$.
Consequently, we can write the assertability criterion for
conditional ought as

$$O(A \mid C) \ \text{ iff } \ \mu(A; \kappa_A(\omega \mid C)) > \mu(true; \kappa(\omega \mid C)) \tag{14}$$

where the function $\mu(\varphi; \kappa(\omega))$ is given in Eq. (13).

We remark that the transformation from $\kappa(\omega \mid C)$ to
$\kappa_A(\omega \mid C)$ requires causal knowledge of the domain,
which will be provided by the causal network $\Gamma$ (Section 5).
Once we are given $\Gamma$ it will be convenient to
encode both $\kappa(\omega)$ and $\mu(\omega)$ using integer-valued labels
on the links of $\Gamma$. Moreover, it is straightforward to
apply Eqs. (7) and (14) to the usual decision theoretic
tasks of selecting an optimal action or an optimal
information-gathering strategy (Chapter 6, Pearl
1988).


Example 1: 

To demonstrate the use of Eq. (14), let us examine the 
assertability of “If it is cloudy you ought to take an 
umbrella” (Boutilier 1993). We assume three atomic 
propositions, c - “Cloudy”, r - “Rain”, and u - “Hav- 
ing an Umbrella”, which form eight worlds, each corre- 
sponding to a complete truth assignment to c, r, and u. 

5 In practice, the specification of $U(\omega)$ is done by defining
an integer-valued variable V (connoting “value”) as a
function of a select set of atomic variables. $W_i^+$ would
correspond then to the assertion $V = +i$, $i = 0, 1, 2, \ldots$
To express our belief that rain does not normally occur
on a clear day, we assign a $\kappa$ value of 1 (indicating one
unit of surprise) to any world satisfying $r \wedge \neg c$ and a
$\kappa$ value of 0 to all other worlds (indicating a serious
possibility that any such world may be realized). To
express the fear of finding ourselves in the rain without
an umbrella, we assign a $\mu$ value of $-1$ to worlds
satisfying $r \wedge \neg u$ and a $\mu$ value of 0 to all other worlds.
Thus, $W^+ = false$, $W^0 = \neg(r \wedge \neg u)$, and $W^- = r \wedge \neg u$.

In this simple example, there is no difference between
$\kappa_A(\omega)$ and $\kappa(\omega \mid A)$ because the act A = “Taking an
umbrella” has the same implications as the finding
“Having an umbrella”. Thus, to evaluate the two expressions
in Eq. (14), with $A = u$ and $C = c$, we first
note that

$$\kappa(W^- \mid u, c) = \kappa(r \wedge \neg u \mid u, c) = \infty$$

$$\kappa(W^- \vee W^+ \mid u, c) = \infty$$

so

$$\mu(u; \kappa(\omega \mid u, c)) = 0$$

Similarly,

$$\kappa(W^- \mid c) = \kappa(r \wedge \neg u \mid c) = 0$$

hence

$$\mu(c; \kappa(\omega \mid c)) = -1 \tag{15}$$

Thus, $O(u \mid c)$ is assertable according to the criterion of
Eq. (14).
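The calculation of Example 1 can be replayed mechanically. The Python sketch below encodes the belief and utility rankings stated above and applies the three-level rule of Eq. (13), using plain conditioning for the action since the example notes that acting and observing coincide here. The function names are assumptions made for this illustration.

```python
import math
from itertools import product

INF = math.inf

# Eight worlds over c ("Cloudy"), r ("Rain"), u ("Having an Umbrella").
WORLDS = [dict(zip("cru", v)) for v in product([True, False], repeat=3)]

def kappa(w):
    """Belief rank: rain on a clear day carries one unit of surprise."""
    return 1 if (w["r"] and not w["c"]) else 0

def mu(w):
    """Utility rank: rain without an umbrella is very undesirable."""
    return -1 if (w["r"] and not w["u"]) else 0

def rank(phi):
    return min((kappa(w) for w in WORLDS if phi(w)), default=INF)

def cond_rank(phi, psi):
    return rank(lambda w: phi(w) and psi(w)) - rank(psi)

def mu_rank(phi):
    """Three-level utility rank of phi under kappa(. | phi), following Eq. (13)."""
    bad = cond_rank(lambda w: mu(w) == -1, phi)
    good = cond_rank(lambda w: mu(w) == +1, phi)
    if bad == 0:
        return -1          # risk-averse: disaster is a serious possibility
    if good == 0:
        return +1
    return 0

doing = mu_rank(lambda w: w["u"] and w["c"])   # mu(u; kappa(w | u, c))
not_doing = mu_rank(lambda w: w["c"])          # mu(c; kappa(w | c))
print(doing, not_doing, doing > not_doing)     # 0 -1 True
```

The final comparison is the test of Eq. (14): the rank of taking the umbrella (0) exceeds the rank of leaving matters to chance (-1).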

Note that although $\kappa(\omega)$ does not assume that normally
we do not have an umbrella with us ($\kappa(u) > 0$),
the advice to take an umbrella is still assertable, since
leaving u to pure chance might result in harsh consequences
(if it rains).

Using the same procedure, it is easy to show that the
example also sanctions the assertability of $O(\neg r \mid c, \neg u)$,
which stands for “If it is cloudy and you don’t have an
umbrella, then you ought to undo (or stop) the rain”.
This is certainly useless advice, as it does not take into
account one’s inability to control the weather. Controllability
information is not encoded in the ranking
functions $\kappa$ and $\mu$; it should be part of one’s causal
theory and can be encoded in the language of causal
networks using costly preconditions that, until satisfied,
would forbid the action do(A) from having any
effect on A.⁶

5 COMBINING ACTIONS AND 
OBSERVATIONS 

In this section we develop a probabilistic account for
the term $\kappa_A(\omega \mid C)$, which stands for the belief ranking
that would prevail if we act A after observing C, i.e.,
the A-update of $\kappa(\omega \mid C)$. First we note that this update
cannot be obtained by simply applying the update
formula developed in (Eq. (2.2), Goldszmidt &
Pearl 1992),

$$\kappa_A(\omega) = \begin{cases} \kappa(\omega) - \kappa(A \mid pa_A(\omega)) & \omega \models A \\ \infty & \omega \models \neg A \end{cases} \tag{16}$$

where $pa_A(\omega)$ are the parents (or immediate causes)
of A in the causal network $\Gamma$ evaluated at $\omega$. The
formula above was derived under the assumption that
$\Gamma$ is not loaded with any observations (e.g., C) and
renders $\kappa_A(\omega)$ undefined for worlds $\omega$ that are excluded
by previous observations and reinstated by A.

6 In decision theory it is customary to attribute direct
costs to actions, which renders $\mu(\omega)$ action-dependent. An
alternative, which is more convenient when actions are not
enumerated explicitly, is to attribute costs to preconditions
that must be satisfied before (any) action becomes
effective.

Figure 1. Persistence interactions between two causal
networks

To motivate the proper transformation from $\kappa(\omega)$ to
$\kappa_A(\omega \mid C)$, we consider two causal networks, $\Gamma'$ and $\Gamma$,
respectively representing the agent’s epistemic states
before and after the action (see Figure 1). Although
the structures of the two networks are almost the same
($\Gamma$ contains additional root nodes representing the action
do(A)), it is the interactions between the corresponding
variables that determine which beliefs are
going to persist in $\Gamma$ and which are to be “clipped” by
the influence of action A.

Let every variable $X_i'$ in $\Gamma'$ be connected to the corresponding
variable $X_i$ in $\Gamma$ by a directed link $X_i' \to X_i$
that represents persistence by default, namely, the natural
tendency of properties to persist, unless there is
a cause for a change. Thus, the parent set of each $X_i$
in $\Gamma$ has been augmented with one more variable: $X_i'$.
To specify the conditional probability of $X_i$, given its
new parent set $\{pa_i \cup X_i'\}$, we need to balance the
tendency of $X_i$ to persist (i.e., be equal to $X_i'$) against
its tendency to obey the causal influence exerted by
$pa_i$. We will assume that persistence forces yield to
causal forces and will perpetuate only those properties
that are not under any causal influence to terminate.
In terms of ranking functions, this assumption reads:⁷

$$\kappa(X_i(\omega) \mid pa_i(\omega), X_i'(\omega')) = \begin{cases} s_i & \text{if } pa_i = \emptyset \text{ and } X_i(\omega) \neq X_i(\omega') \\ s_i + \kappa(X_i(\omega) \mid pa_i(\omega)) & \text{if } X_i(\omega) \neq X_i(\omega') \text{ and } \kappa(\neg X_i(\omega) \mid pa_i(\omega)) = 0 \\ \kappa(X_i(\omega) \mid pa_i(\omega)) & \text{otherwise} \end{cases} \tag{17}$$

where $\omega'$ and $\omega$ specify the truth values of the variables
in the corresponding networks, $\Gamma'$ and $\Gamma$, and $s_i \geq 1$ is
a constant characterizing the tendency of $X_i$ to persist.

Eq. (17) states that the past value of $X_i$ may affect the
normal relation between $X_i$ and its parents only when
it differs from the current value and, at the same time,
the parents of $X_i$ do not compel the change. In such
a case, the inequality $X_i(\omega) \neq X_i'(\omega')$ contributes $s_i$
units of surprise to the normal relation between $X_i$ and
its parents. The unique feature of this model, unlike
the one proposed in (Goldszmidt & Pearl 1992), is that
persistence defaults can be violated by causal factors
without forcing us to conclude that such factors are
abnormal.

Eq. (17) specifies the conditional rank $\kappa(X \mid pa_X)$ for
every variable X in the combined networks and, hence,
it provides a complete specification of the joint rank
$\kappa(\omega, \omega')$.⁸ The desired expression for the post-action
ranking $\kappa_A(\omega)$ can then be obtained by marginalizing
$\kappa(\omega, \omega')$ over $\omega'$:

$$\kappa_A(\omega) = \min_{\omega'} \kappa(\omega, \omega') \tag{18}$$

We need, however, to account for the fact that some
variables in network $\Gamma$ are under the direct influence
of the action A, and hence the parents of these nodes
are replaced by the action node do(A). If A consists
of a conjunction of atomic propositions,
$A = \bigwedge_{j \in J} A_j$, where each $A_j$ stands for either $X_j = true$
or $X_j = false$, then each $X_j$, $j \in J$, should be exempt
from incurring the spontaneity penalty specified
in Eq. (17). Additionally, in calculating $\kappa(\omega, \omega')$
we need to sum $\kappa(X_i(\omega) \mid pa_i(\omega), X_i'(\omega'))$ only over
$i \notin J$, namely, over variables not under the direct
influence of A. Thus, collecting terms and writing
$\kappa(\omega) = \sum_i \kappa(X_i(\omega) \mid pa_i(\omega))$, we obtain

$$\kappa_A(\omega \mid C) = \kappa(\omega) - \sum_{i \in J \cup R} \kappa(X_i(\omega) \mid pa_i(\omega)) + \min_{\omega'} \Big[ \sum_{i \notin J} S_i(\omega, \omega') + \kappa(\omega' \mid C) \Big] \tag{19}$$

where R is the set of root nodes and

$$S_i(\omega, \omega') = \begin{cases} s_i & \text{if } X_i(\omega) \neq X_i(\omega') \text{ and } pa_i = \emptyset \\ s_i & \text{if } X_i(\omega) \neq X_i(\omega'),\ pa_i \neq \emptyset \text{ and } \kappa(\neg X_i(\omega) \mid pa_i(\omega)) = 0 \\ 0 & \text{otherwise} \end{cases} \tag{20}$$

7 This is essentially the persistence model used by Dean
and Kanazawa (Dean & Kanazawa 1989), in which $s_i$ represents
the survival function of $X_i$. The use of ranking
functions allows us to distinguish crisply between changes
that are causally supported, $\kappa(\neg X_i(\omega) \mid pa_i(\omega)) > 0$, and
those that are unsupported, $\kappa(\neg X_i(\omega) \mid pa_i(\omega)) = 0$.

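8 The expressions, familiar in probability theory,

$$P(\omega, \omega') = \prod_j P(X_j(\omega, \omega') \mid pa_j(\omega, \omega')), \qquad P(\omega) = \sum_{\omega'} P(\omega, \omega')$$

translate into the ranking expressions

$$\kappa(\omega, \omega') = \sum_j \kappa(X_j(\omega, \omega') \mid pa_j(\omega, \omega')), \qquad \kappa(\omega) = \min_{\omega'} \kappa(\omega, \omega')$$

where j ranges over all variables in the two networks.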


Eq. (19) demonstrates that the effect of observations 
and actions can be computed as an updating opera- 
tion on epistemic states, these states being organized 
by a fixed causal network, with the only varying element
being $\kappa$, the belief ranking. Long streams of
observations and actions could therefore be processed 
as a sequence of updates on some initial state, without 
requiring analysis of long chains of temporally indexed 
networks, as in Dean and Kanazawa (1989). 

To handle disjunctive actions such as “Paint the wall 
either red or blue” one must decide between two in- 
terpretations: “Paint the wall red or blue regardless 
of its current color” or “Paint the wall either red or 
blue but, if possible, do not change its current color” 
(see Katsuno & Mendelzon 1991 and Goldszmidt & 
Pearl 1992). We will adopt the former interpretation, 
according to which “$do(A \vee B)$” is merely a shorthand
for “$do(A) \vee do(B)$”. This interpretation is particularly
convenient for ranking systems, because for any
two propositions, A and B, we have

$$\kappa(A \vee B) = \min[\kappa(A); \kappa(B)] \tag{21}$$

Thus, if we do not know which action, A or B, will
be implemented but consider either to be a serious
possibility, then

$$\kappa_{A \vee B}(\omega) = \min[\kappa_A(\omega); \kappa_B(\omega)] \tag{22}$$

Accordingly, if A is a disjunction of actions,
$A = \bigvee_l A^l$, where each $A^l$ is a conjunction of atomic
propositions, then

$$\kappa_A(\omega \mid C) = \min_l \kappa_{A^l}(\omega \mid C) \tag{23}$$
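Eq. (23) amounts to a world-by-world minimum over the rankings produced by the individual disjuncts. A minimal Python sketch follows; the function name `disjunctive_update` and the wall-colour rankings are hypothetical, chosen to echo the painting example above.

```python
import math

INF = math.inf

def disjunctive_update(*post_action_ranks):
    """Combine the post-action rankings of the disjuncts into one ranking, as in Eq. (23).

    Each argument maps worlds to ranks; a world missing from a ranking
    is treated as excluded by that disjunct (rank infinity).
    """
    all_worlds = set().union(*post_action_ranks)
    return {w: min(r.get(w, INF) for r in post_action_ranks) for w in all_worlds}

# Hypothetical rankings over wall colours after "paint red" and after "paint blue".
kappa_red = {"red": 0, "blue": INF, "white": INF}
kappa_blue = {"red": INF, "blue": 0, "white": INF}

kappa_either = disjunctive_update(kappa_red, kappa_blue)
print(kappa_either["red"], kappa_either["blue"], kappa_either["white"])   # 0 0 inf
```

Under this reading, both a red and a blue wall are serious possibilities after the disjunctive action, while the wall's current colour plays no role.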

Example 2 

To demonstrate the interplay between actions and ob- 
servations, we will test the assertability of the following 
dialogue: 

Robot 1: It is too dark in here. 

Robot 2: Then you ought to push the switch up. 
Robot 1: The switch is already up. 

Robot 2: Then you ought to push the switch down. 

The challenge would be to explain the reversal of the 
“ought” statement in response to the new observation 
“The switch is already up”. The inferences involved 
in this example revolve around identifying the type of
switch Robot 1 is facing, that is, whether it is normal
($n$) or abnormal ($\neg n$) (a normal switch is one that
should be pushed up ($u$) to turn the light on ($l$)). The
causal network, shown in Figure 2, involves three variables:

L - the current state of the light ($l$ vs $\neg l$),

S - the current position of the switch ($u$ vs $\neg u$), and

T - the type of switch at hand ($n$ vs $\neg n$).

The variable L stands in functional relationship to S
and T, via

$$l = (n \wedge u) \vee (\neg n \wedge \neg u) \tag{24}$$
or, equivalently, $\kappa = \infty$ unless $l$ satisfies the relation
above.


Since initially the switch is believed to be normal,
we set $\kappa(\neg n) = 1$, resulting in the following initial
ranking:

S          T          L          κ(ω)
$u$        $n$        $l$        0
$\neg u$   $n$        $\neg l$   0
$u$        $\neg n$   $\neg l$   1
$\neg u$   $\neg n$   $l$        1


Figure 2: Causal network for Example 2

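We also assume that Robot 1 prefers light to darkness,
by setting

$$\mu(\omega) = \begin{cases} -1 & \text{if } \omega \models \neg l \\ 0 & \text{if } \omega \models l \end{cases} \tag{25}$$

The first statement of Robot 1 expresses an observation
$C = \neg l$, yielding

$$\kappa(\omega \mid C) = \begin{cases} 0 & \text{for } \omega = \neg u \wedge n \wedge \neg l \\ 1 & \text{for } \omega = u \wedge \neg n \wedge \neg l \\ \infty & \text{for all other worlds} \end{cases} \tag{26}$$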

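To evaluate $\kappa_A(\omega \mid C)$ for $A = u$, we now invoke Eq.
(19), using the spontaneity functions

$$S_T(\omega, \omega') = 1 \ \text{ if } T(\omega) \neq T(\omega')$$
$$S_L(\omega, \omega') = 0 \ \text{ if } L(\omega) \neq L(\omega') \tag{27}$$

because $L(\omega)$, being functionally determined by
$pa_L(\omega)$, is exempt from conforming to persistence defaults.
Moreover, for action $A = u$ we also have
$\kappa(u \mid pa_S) = \kappa(u) = 0$, hence

$$\kappa_A(\omega \mid C) = \kappa(\omega) - \kappa(T(\omega)) + \min_{\omega'} \{ I[T(\omega) \neq T(\omega')] + \kappa(\omega' \mid C) \} \quad \text{for } \omega = \omega_1, \omega_2 \tag{28}$$

where $I[p]$ equals 1 (or 0) if p is true (or false), and

$$\omega_1 = u \wedge n \wedge l, \qquad \omega_1' = \neg u \wedge n \wedge \neg l$$
$$\omega_2 = u \wedge \neg n \wedge \neg l, \qquad \omega_2' = u \wedge \neg n \wedge \neg l \tag{29}$$

All other worlds are excluded by either $A = u$ or
$C = \neg l$.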


Minimizing Eq. (19) over the two possible $\omega'$ worlds
yields

$$\kappa_A(\omega \mid C) = \begin{cases} 0 & \text{for } \omega = \omega_1 \\ 1 & \text{for } \omega = \omega_2 \end{cases} \tag{30}$$

We see that $\omega_2 = u \wedge \neg n \wedge \neg l$ is penalized with one
unit of surprise for exhibiting an unexplained change
in switch type (initially believed to be normal).

It is worth noting how $\omega_1$, which originally was ruled
out (with $\kappa = \infty$) by the observation $\neg l$, is suddenly
reinstated after taking the action $A = u$. In fact, Eq.
(19) first restores all worlds to their original $\kappa(\omega)$ value
and then adjusts their value in three steps. First it
excludes worlds satisfying $\neg A$, then adjusts the $\kappa(\omega)$ of
the remaining worlds by an amount $\kappa(A \mid pa_A(\omega))$, and
finally makes an additional adjustment for violation of
persistence.

From Eqs. (26) and (28), we see that $\kappa_A(l \mid C) = 0 <
\kappa(l \mid C) = \infty$, hence the action $A = u$ meets the
assertability criterion of Eq. (14) and the first statement,
“You ought to push the switch up”, is justified. At this
point, Robot 2 receives a new piece of evidence: $S = u$.
As a result, $\kappa(\omega \mid \neg l)$ changes to $\kappa(\omega \mid \neg l, u)$ and the
calculation of $\kappa_A(\omega \mid C)$ needs to be repeated with a new
set of observations, $C = \neg l \wedge u$. Since $\kappa(\omega' \mid \neg l, u)$
permits only one possible world, $\omega' = u \wedge \neg n \wedge \neg l$, the
minimization of Eq. (19) can be skipped, yielding (for
$A = \neg u$)

$$\kappa_A(\omega \mid C) = \begin{cases} 0 & \text{for } \omega = \neg u \wedge \neg n \wedge l \\ 1 & \text{for } \omega = \neg u \wedge n \wedge \neg l \end{cases} \tag{31}$$


which, in turn, justifies the opposite “ought” state- 
ment (“Then you ought to push the switch down”). 
Note that although finding a normal switch is less 
surprising than finding an abnormal switch, a spontaneous
transition to such a state would violate persistence
and is therefore penalized by obtaining a $\kappa$
of 1.
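As a check, the two entries of Eq. (31) can be obtained term by term. The sketch below assumes that the form of Eq. (28) carries over unchanged to the action $A = \neg u$ (that is, that $\kappa(\neg u) = 0$, by symmetry with the case $A = u$), and uses the initial ranking with $\kappa(\neg n) = 1$. The only admissible prior world is $\omega' = u \wedge \neg n \wedge \neg l$, with $\kappa(\omega' \mid C) = 0$.

```latex
\begin{align*}
\kappa_A(\omega \mid C)
  &= \kappa(\omega) - \kappa(T(\omega))
     + I[T(\omega) \neq T(\omega')] + \kappa(\omega' \mid C) \\[4pt]
\omega = \neg u \wedge \neg n \wedge l:\quad
\kappa_A(\omega \mid C) &= 1 - 1 + 0 + 0 = 0 \\
\omega = \neg u \wedge n \wedge \neg l:\quad
\kappa_A(\omega \mid C) &= 0 - 0 + 1 + 0 = 1
\end{align*}
```

The unit of surprise in the second line comes entirely from the indicator term: the switch type would have to change from abnormal to normal without explanation.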


6 RELATIONS TO OTHER 
ACCOUNTS 

6.1 DEONTIC AND PREFERENCE 
LOGICS 

Ought statements of the pragmatic variety have been 
investigated in two branches of philosophy, deontic 
logic and preference logic. Surprisingly, despite an 
intense effort to establish a satisfactory account of 
“ought” statements (Von Wright 1963, Van Fraassen 
1973, Lewis 1973), the literature of both logics is 
loaded with paradoxes and voids of principle. This
raises the question of whether “ought” statements are
destined to forever elude formalization or whether the
approach taken by deontic logicians has been misdirected.
I believe the answer involves a combination of
both.

Philosophers hoped to develop deontic logic as a separate
branch of conditional logic, not as a synthetic
amalgam of logics of belief, action, and causation.
In other words, they have attempted to capture the
meaning of “ought” using a single modal operator $O(\cdot)$,
instead of exploring the couplings between “ought”
and other modalities, such as belief, action, causation,
and desire. The present paper shows that such an
isolationistic strategy has little chance of succeeding.
Whereas one can perhaps get by without explicit refer- 
ence to desire, it is absolutely necessary to have both 
probabilistic knowledge about the effect of observa- 
tions on the likelihood of events and causal knowledge 
about actions and their consequences. 

We have seen in Section 3 that to ratify the sentence
“Given C, you ought to do A”, we need to know not
merely the relative desirability of the worlds delineated
by the propositions $A \wedge C$ and $\neg A \wedge C$, but also the
feasibility or likelihood of reaching any one of those 
worlds in the future, after making our choice of A. We 
also saw that this likelihood depends critically on how 
C is confirmed, by observation or by action. Since this 
information cannot be obtained from the logical con- 
tent of A and C, it is not surprising that “almost ev- 
ery principle which has been proposed as fundamental 
to a preference logic has been rejected by some other 
source” (Mullen 1979). 

In fact, the decision theoretic account embodied in Eq.
(14) can be used to generate counterexamples to most
of the principles suggested in the literature, simply by
selecting a combination of $\kappa$, $\mu$, and $\Gamma$ that defies the
proposed principle. Since any such principle must be
valid in all epistemic states and since we have enormous
freedom in choosing these three components, it
is not surprising that only weak principles, such as
$O(A \mid C) \Rightarrow \neg O(\neg A \mid C)$, survive the test. Among the
few that do survive, we find the sure-thing principle:

$$O(A \mid C) \wedge O(A \mid \neg C) \Rightarrow O(A) \tag{32}$$

read as “If you ought to do A given C and you ought
to do A given $\neg C$, then you ought to do A without
examining C”. But one begins to wonder about the
value of assembling a logic from a sparse collection of
such impoverished survivors when, in practice, a full
specification of $\kappa$, $\mu$, and $\Gamma$ would be required.

6.2 COUNTERFACTUAL CONDITIONALS 

Stalnaker (1972) was the first to make the connection 
between actions and counterfactual statements, and he 
proposed using the probability of the counterfactual 
conditional (as opposed to the conditional probability, 
which is more appropriate for indicative conditionals) 
in the calculation of expected utilities. Stalnaker’s theory
does not provide an explicit connection between
subjunctive conditionals and causation, however.
Although the selection function used in the Stalnaker-Lewis
nearest-world semantics can be thought of as a
generalization of, and a surrogate for, causal knowl- 
edge, it is too general, as it is not constrained by the 


basic features of causal relationships such as asym- 
metry, transitivity, and complicity with temporal or- 
der. To the best of my knowledge, there has been no 
attempt to translate causal sentences into specifications
of the Stalnaker-Lewis selection function.⁹ Such
specifications were partially provided in (Goldszmidt
& Pearl 1992), through the imaging function $\omega^{*}(\omega)$,
and are further refined in this paper by invoking the 
persistence model (Eq. (19)). Note that a directed 
acyclic graph is the only ingredient one needs to add 
to the traditional notion of epistemic state so as to 
specify a causality-based selection function. 

From this vantage point, our calculus provides, in 
essence, a new account of subjunctive conditionals that 
is more reflective of those used in decision making. The 
account is based on giving the subjunctive the following
causal interpretation: “Given C, if I were to perform A,
then I believe B would come about”, written
$A > B \mid C$, which in the language of ranking functions
reads

$$\kappa(\neg B \mid C) = 0 \quad \text{and} \quad \kappa_A(\neg B \mid C) > 0 \tag{33}$$

The equality states that $\neg B$ is considered a serious
possibility prior to performing A, while the inequality
renders $\neg B$ surprising after performing A. This
account, which we call Decision Making Conditionals 
(DMC), avoids several paradoxes of conditional log- 
ics (see Nute 1992) and is further described in (Pearl 
1993). 
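The test in Eq. (33) is easy to state operationally. The following Python sketch applies it to the switch example, taking the rankings of Eqs. (26) and (30) as given inputs rather than recomputing them; the helper names `rank_of` and `dmc_holds` are hypothetical, and worlds are encoded as `(u, n, l)` triples with excluded worlds simply omitted.

```python
INF = float("inf")

def rank_of(proposition, ranking):
    """Rank of a proposition: the minimum rank over the worlds satisfying it
    (infinity if no listed world satisfies it)."""
    return min((r for w, r in ranking.items() if proposition(w)), default=INF)

def dmc_holds(not_b, kappa_given_c, kappa_a_given_c):
    """Eq. (33): not-B is a serious possibility before the action
    (rank 0) and surprising after it (rank > 0)."""
    return rank_of(not_b, kappa_given_c) == 0 and rank_of(not_b, kappa_a_given_c) > 0

# Rankings from the switch example, with C = "not l" and A = u.
kappa_c = {(False, True, False): 0, (True, False, False): 1}   # Eq. (26)
kappa_a_c = {(True, True, True): 0, (True, False, False): 1}   # Eq. (30)

dark = lambda w: not w[2]
lit = lambda w: w[2]

# "Given darkness, if I were to push the switch up, the light would come on":
# here B = l, so not-B = dark.
light_would_come_on = dmc_holds(dark, kappa_c, kappa_a_c)
# The opposite conditional, B = "not l", so not-B = lit.
would_stay_dark = dmc_holds(lit, kappa_c, kappa_a_c)
print(light_would_come_on, would_stay_dark)
```

With these inputs the first conditional is accepted and the second is rejected, since a lit room is not a serious possibility once darkness has been observed.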

6.3 OTHER DECISION THEORETIC 
ACCOUNTS 

Poole (1992) has proposed a quantitative decision-theoretic
account of defaults, taking the utility of A,
given evidence e, to be

$$\mu(A \mid e) = \sum_{\omega} \mu(\omega, A) P(\omega \mid e) \tag{34}$$

This requires a specification of an action-dependent
preference function for each $(\omega, A)$ pair. Our proposal
(in line with (Stalnaker 1972)) attributes the dependence
of $\mu$ on A to beliefs about the possible consequences
of A, thereby keeping the utility of each consequence
constant. In this way, we see more clearly how
the structure of causal theories should affect the choice 
of actions. For example, if A and e are incompatible
(“If the light is on (e), turn it off (A)”), taking
(34) literally (without introducing temporal indices)
would yield absurd results. Additionally, Poole’s is a
calculus of incremental improvements of utility, while
ours is concerned with substantial improvements, as is
typical of ought statements.

⁹ Gibbard and Harper (Gibbard & Harper 1980) develop
a quantitative theory of rational decisions that is based
on Stalnaker’s suggestion and explicitly attributes causal
character to counterfactual conditionals. However, they
assume that probabilities of counterfactuals are given in
advance and do not specify either how such probabilities
are encoded or how they relate to probabilities of ordinary
propositions. Likewise, a criterion for accepting a
counterfactual conditional, given other counterfactuals and other
propositions, is not provided.
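To make the contrast with Poole's Eq. (34) concrete, the quantity it defines can be computed directly once an action-dependent preference table is supplied. The sketch below is purely illustrative: the world labels, probabilities, and utilities are made-up numbers, and `expected_utility` is a hypothetical helper, not part of Poole's formalism.

```python
def expected_utility(action, posterior, mu):
    """Eq. (34): sum over worlds of mu(w, A) * P(w | e)."""
    return sum(mu[(world, action)] * p for world, p in posterior.items())

# Hypothetical posterior P(w | e) over two worlds.
posterior = {"w1": 0.75, "w2": 0.25}

# Hypothetical action-dependent preferences mu(w, A): one entry per (world, action) pair.
mu = {
    ("w1", "A"): 2.0,
    ("w2", "A"): -4.0,
    ("w1", "not_A"): 0.0,
    ("w2", "not_A"): 0.0,
}

eu_a = expected_utility("A", posterior, mu)          # 2.0 * 0.75 + (-4.0) * 0.25
eu_not_a = expected_utility("not_A", posterior, mu)  # 0.0
print(eu_a, eu_not_a)
```

Note that the table needs one utility per (world, action) pair, which is the specification burden the text attributes to this formulation.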

Boutilier (1993) has developed a modal logic account 
of conditional goals which embodies considerations 
similar to ours. It remains to be seen whether causal 
relationships such as those governing the interplay 
among actions and observations can easily be encoded 
into his formalism. 


7 CONCLUSION 

By pursuing the semantics of ought statements, this paper
develops an account of qualitative decision theory
and a framework for qualitative planning under uncertainty.
The two main features of this account are:

1. Order-of-magnitude specifications of probabilities
and utilities are combined to produce qualitative expected
utilities of actions and consequences, conditioned
on observations (Eq. (7)).

2. A single causal network, combined with universal
assumptions of persistence, is sufficient for specifying
the dynamics of beliefs under any sequence of actions
and observations (Eq. (9)).